Fix no-onig no-wasm builds #1772

Merged 1 commit into huggingface:main on May 27, 2025

Conversation

414owen
Contributor

@414owen 414owen commented May 1, 2025

Now, you can build with:

--no-default-features --features=fancy-regex

This previously didn't work: you had to enable the `unstable_wasm` flag.

I think using `fancy_regex` without wasm is a valid use case, as I've seen extremely slow build times using `onig`.
See: #1730

Onig also breaks, sometimes, with compiler updates. See: #1771

Closes #1729
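For readers unfamiliar with Cargo feature gating, here is a minimal sketch of how a crate might pick a regex backend from its enabled features. It is not the code from this PR: the `backend` module, the `NAME` constant, and `backend_name` are hypothetical names, and only the feature names `onig` and `fancy-regex` come from the discussion above. The point it illustrates is that each gate is keyed on the backend's own feature, so a `fancy-regex`-only build does not depend on an unrelated flag.

```rust
// Hypothetical sketch: select a regex backend at compile time from Cargo features.
// The module and function names are illustrative, not taken from tokenizers.

// Chosen whenever the `onig` feature is enabled.
#[cfg(feature = "onig")]
mod backend {
    pub const NAME: &str = "onig";
}

// Chosen when `fancy-regex` is enabled and `onig` is not.
// Note that the condition does not mention any wasm-related feature.
#[cfg(all(feature = "fancy-regex", not(feature = "onig")))]
mod backend {
    pub const NAME: &str = "fancy-regex";
}

// Fallback when neither feature is enabled. A real crate might prefer
// `compile_error!` here; a constant keeps this sketch buildable on its own.
#[cfg(not(any(feature = "onig", feature = "fancy-regex")))]
mod backend {
    pub const NAME: &str = "none";
}

/// Returns the name of the backend selected at compile time.
pub fn backend_name() -> &'static str {
    backend::NAME
}
```

Exactly one of the three `backend` modules is compiled for any feature combination, so the sketch also builds when both features are enabled at once (in that case `onig` takes precedence, which is an arbitrary choice made for this example).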

mmoskal pushed a commit to guidance-ai/llguidance that referenced this pull request May 6, 2025
The version of Oniguruma used in `onig_sys` doesn't build on GCC 15 and
the oniguruma project itself got archived last week, so this PR switches
tokenizers to the fancy-regex backend.

`fancy-regex` also requires flipping on the `unstable_wasm` feature
until huggingface/tokenizers#1772 lands. That flag doesn't have any ill
effects, though, since everything WASM related downstream is behind
`target_arch` checks.

**tl;dr**: This fixes builds on Linux distros with newer GCC versions
like Arch Linux and Fedora.
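The `target_arch` checks mentioned in the commit message can be sketched as follows. This is an illustration of the general Rust mechanism, not code from llguidance or tokenizers; `is_wasm_target` and `platform_note` are hypothetical names. Code gated this way is simply not compiled on non-WASM targets, which is why enabling a WASM-oriented feature on a native build need not change behavior there.

```rust
// Hypothetical sketch: gate WASM-specific code on the compilation target
// rather than on a Cargo feature.

/// True only when compiling for a 32-bit WebAssembly target.
pub fn is_wasm_target() -> bool {
    cfg!(target_arch = "wasm32")
}

// Compiled only for wasm32 targets.
#[cfg(target_arch = "wasm32")]
pub fn platform_note() -> &'static str {
    "wasm32-specific path"
}

// Compiled for every other target, such as x86_64 or aarch64.
#[cfg(not(target_arch = "wasm32"))]
pub fn platform_note() -> &'static str {
    "native path"
}
```

On a native Linux build, `is_wasm_target()` is `false` and only the second `platform_note` exists in the compiled crate.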
@adamreichold

Considering the archiving of the upstream Oniguruma repository and its build problems with GCC 15, I wonder whether the better course of action here isn't to just depend on fancy-regex unconditionally, which would further simplify the whole setup.

@414owen
Contributor Author

414owen commented May 26, 2025

@adamreichold yes, I would also be in favour of that.

@ArthurZucker can we please get your, or someone else's, eyes on this?

Collaborator

@ArthurZucker ArthurZucker left a comment

Sorry for the late review, thanks!
I'd rather we keep both for now, for some weird hardware.

@ArthurZucker ArthurZucker merged commit 67db0cd into huggingface:main May 27, 2025
@ArthurZucker
Collaborator

This breaks `cargo build --all-targets --all-features`. I'll fix it to have both.

Development

Successfully merging this pull request may close these issues.

Building without onig feature fails